Syntax-Based Deep Matching of Short Texts
Abstract
Many tasks in natural language processing, ranging from machine translation to question answering, can be reduced to the problem of matching two sentences or, more generally, two short texts. We propose a new approach to the problem, called Deep Match Tree (DEEPMATCHtree), under a general setting. The approach consists of two components: 1) a mining algorithm that discovers patterns for matching two short texts, defined in the product space of dependency trees, and 2) a deep neural network that matches short texts using the mined patterns, together with a learning algorithm that builds the network with a sparse structure. We test our algorithm on the problem of matching a tweet and a response in social media, a hard matching problem proposed in [Wang et al., 2013], and show that DEEPMATCHtree outperforms a number of competitor models, including one that does not use dependency trees and one based on word embedding, all by large margins.
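The two-stage idea in the abstract (mine matching patterns first, then use them as sparse inputs to a matcher) can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes flat word pairs in place of dependency-subtree pairs, uses plain co-occurrence counting as the mining step, and stops at the binary feature vector that a network would consume. The names `mine_patterns`, `pattern_features`, and `min_support` are hypothetical.

```python
from collections import Counter
from itertools import product


def mine_patterns(pairs, min_support=2):
    """Toy mining step: keep (word_x, word_y) pairs that co-occur in at
    least `min_support` text pairs. ASSUMPTION: flat word pairs stand in
    for the dependency-tree patterns described in the abstract."""
    counts = Counter()
    for x, y in pairs:
        # Build a set first so each pattern is counted once per text pair.
        counts.update(set(product(set(x.split()), set(y.split()))))
    return sorted(p for p, c in counts.items() if c >= min_support)


def pattern_features(x, y, patterns):
    """Binary vector with a 1 wherever a mined pattern fires on (x, y).
    A matching network would take this vector as its input layer."""
    xs, ys = set(x.split()), set(y.split())
    return [1 if wx in xs and wy in ys else 0 for wx, wy in patterns]


# Made-up (tweet, response) pairs, for illustration only.
pairs = [
    ("my flight got delayed", "hope you get home soon"),
    ("flight delayed again", "hope it leaves soon"),
    ("great pizza tonight", "sounds delicious"),
]
patterns = mine_patterns(pairs)
features = pattern_features("flight delayed two hours", "hope it is soon", patterns)
```

Because only frequently co-occurring patterns survive mining, the input vector stays short and mostly zero for unrelated text pairs, which is one plausible reading of why a pattern-driven network can be kept sparse.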
Similar Resources
Fast-Syntax-Matching-Based Japanese-Chinese Limited Machine Translation
Limited machine translation (LMT) is an unliterate automatic translation based on bilingual dictionary and sentence bank. This paper addresses the Japanese-Chinese LMT problem, proposes two syntactic hypotheses about Japanese texts, and designs a fast-syntax-matching-based Japanese-Chinese (FSMJC) LMT algorithm. In which, the fast syntax matching function, a modified version of Levenshtein func...
Mechanism to Enrich Short Texts Query Search Engine in Large-Scale Data Collection
Retrieving semantically similar short texts is a crucial issue for many applications, e.g., web search, ad matching, question answering systems, and so forth. Most traditional methods concentrate on improving the precision of the similarity measurement, while current real applications need to efficiently find the top similar short texts that are semantically related to the query. We address t...
A Deep Network Model for Paraphrase Detection in Short Text Messages
This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is ...
Match-SRNN: Modeling the Recursive Matching Structure with Spatial RNN
Semantic matching, which aims to determine the matching degree between two texts, is a fundamental problem for many NLP applications. Recently, deep learning approaches have been applied to this problem and significant improvements have been achieved. In this paper, we propose to view the generation of the global interaction between two texts as a recursive process: i.e. the interaction of two tex...
A Fast Matching Method Based on Semantic Similarity for Short Texts
With the emergence of various social media, short texts, such as weibos and instant messages, have become very prevalent on today’s websites. To mine semantically similar information from massive data, a fast and efficient matching method for short texts has become an urgent need. However, conventional matching methods suffer from data sparsity in short documents. In this paper, we propo...
Publication date: 2015